FIELD OF THE INVENTION

The invention relates to a method and arrangement for embedding 
supplemental data in a digital video signal. The invention also relates to an arrangement for 
decoding the embedded supplemental data. 

BACKGROUND OF THE INVENTION

Video and audio signals are increasingly transmitted and recorded in a
digitally encoded form, for example, an MPEG bit stream. There is a growing need to
accommodate supplemental data in the signal, for example, a watermark to classify the signal
as authentic program material. Watermarking digital signals is particularly useful in copy
protection applications. The watermark can effectively take the form of a single bit indicating
that the signal constitutes copy protected material, or a multi-bit code representing the
originator of the material.

In the known MPEG standard for audio and video compression a copy
protection bit has been defined for that purpose. However, a disadvantage of this known
method is that the protection bit can easily be modified to circumvent the copyright
protection mechanism.

OBJECT AND SUMMARY OF THE INVENTION 

It is an object of the invention to provide a method and arrangement for
embedding a watermark in a video signal in such a manner that the embedded watermark can
easily be detected but is difficult to remove.

To this end, the invention provides a method of embedding supplemental
data in a video signal comprising the step of encoding the video signal in groups of pictures
comprising an intraframe (I) coded picture and a series of predictively (P) and bidirectionally
predictively (B) coded pictures, characterized by encoding the video signal in such a manner
that the pattern of picture coding types in a group of pictures (GOP) represents a
supplemental data value.

The invention achieves that a watermark can easily be detected.
The picture coding types are accommodated in the picture headers of an MPEG bit stream
and can easily be read. However, changing the picture coding type in a picture header to
remove the watermark renders the picture data no longer compliant with the coding standard.
The MPEG bit stream can then no longer be decoded by a compliant decoder. The relevant
picture must be transcoded to comply with the new picture coding type, e.g. by decoding the
picture and encoding it again.

It should be noted that the general idea of generating a predetermined
sequence of I, P and B pictures in an MPEG signal so as to mark a digital video signal has
also been proposed in Applicant's International Patent Application WO 97/13248. However,
this application was published after the priority date of the present invention and fails to
disclose the representation of a supplemental data value by a pattern of picture coding types
within a group of pictures.

Preferably, the supplemental data value is represented by a given pattern
of B and P picture coding types in a GOP, for example, by the position of a BPP pattern in a
GOP. This achieves that changing a picture coding type also changes its reference to
other pictures within the GOP and, consequently, ripples through the remainder of the GOP.
To remove a watermark, a substantial number of pictures in the GOP must now be
transcoded rather than a single picture. There is one exception: a P picture can be transcoded
into an I picture without requiring other pictures to be transcoded as well. However, the I
picture must then be encoded with the low number of bits used for the P picture. This affects
the quality of the I picture as well as any P picture referring to this I picture. Consequently,
a watermark cannot be removed from a GOP without either transcoding the remainder of the
GOP or suffering a severe decrease in quality for the remainder of the GOP.

BRIEF DESCRIPTION OF THE DRAWINGS

Figs. 1-4 show examples of GOP structures of an MPEG encoded video 
signal to illustrate the method of embedding supplemental data in accordance with the 
invention. 

Fig. 5 shows an example of assigning different supplemental data values
to respective positions of a BPP pattern in a GOP.

Fig. 6 shows a schematic diagram of an arrangement for embedding 
supplemental data in an MPEG video signal in accordance with the invention. 

Fig. 7 shows a flowchart illustrating the operation of a control circuit
which is shown in Fig. 6.

Fig. 8 shows a schematic diagram of an arrangement for decoding
supplemental data embedded in an MPEG encoded video signal in accordance with the
invention.



DESCRIPTION OF EMBODIMENTS

First, the basic principles of MPEG which are essential to the 
watermarking method in accordance with the invention will be briefly described. 

To achieve efficient video compression, an MPEG encoder encodes
pictures in accordance with one of three different coding methods. Some pictures are
autonomously encoded, i.e. without any reference to another picture in the video sequence.
These pictures are denoted intraframe coded pictures or I pictures. Other pictures are
predictively encoded, using a motion-compensated previous picture as reference (prediction)
image. They are denoted P pictures. The previous picture to which a P picture refers may be
an I picture or another P picture. Yet other pictures are bidirectionally predictively encoded.
They refer to a previous as well as a future I or P picture and are denoted B pictures.

Generally, the number of bits required to represent a picture is highest for I
pictures, lower for P pictures and lowest for B pictures. The amount of compression and the
quality of the decoded video sequence largely depend on the performance of the motion
estimation process in the encoder. Motion estimation is the most complicated and
computationally intensive operation of an MPEG encoder. It is this operation which will make
professional MPEG encoders far superior to cost-effective consumer encoders for a long
time.

In order to inform an MPEG decoder whether a received picture is an I, P
or B picture, a parameter picture_type in each picture header of an MPEG video bit stream
describes how the relevant picture has been encoded. If the picture coding type is I, the
decoder reconstructs the picture completely from the received picture data. If the picture
coding type is P, the decoder reconstructs the picture from the received picture data and an
already displayed I or P picture. If the picture coding type is B, it is reconstructed on the
basis of a preceding as well as a succeeding I or P picture. It should be noted that the
parameter picture_type implicitly specifies the reference picture(s): a P picture refers to the
most recent I or P picture, a B picture refers to the most recent and the next I or P picture.
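
The implicit referencing rule can be made concrete with a short sketch. The Python function
below is illustrative only: the name reference_pictures and the representation of a GOP as a
string of coding types in display order are assumptions made for this example, not part of the
MPEG standard. For each picture it lists the 1-based numbers of the pictures referred to.

```python
from __future__ import annotations


def reference_pictures(pattern: str) -> dict[int, list[int]]:
    """Map each 1-based picture number to the pictures it refers to.

    `pattern` holds the picture coding types in display order, e.g. "IBBP".
    A P picture refers to the most recent I or P picture; a B picture refers
    to the most recent and the next I or P picture. References that would
    fall outside the given pattern are simply omitted.
    """
    anchors = [i for i, t in enumerate(pattern, start=1) if t in "IP"]
    refs: dict[int, list[int]] = {}
    for i, t in enumerate(pattern, start=1):
        previous = [a for a in anchors if a < i]
        following = [a for a in anchors if a > i]
        if t == "I":
            refs[i] = []
        elif t == "P":
            refs[i] = previous[-1:]
        elif t == "B":
            refs[i] = previous[-1:] + following[:1]
        else:
            raise ValueError(f"unknown picture coding type {t!r}")
    return refs


print(reference_pictures("IBBPBBP"))
# {1: [], 2: [1, 4], 3: [1, 4], 4: [1], 5: [4, 7], 6: [4, 7], 7: [4]}
```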

A series comprising an I picture and consecutive B and P pictures is called a
Group of Pictures (GOP). According to the MPEG standard, an encoder is free to choose the
optimum sequence of I, B and P picture coding types. However, only a few GOP structures
are used in practice:

IPPP.. is used by cheap encoders that do not have access to large amounts of memory.

IBPBP.. is used by more advanced encoders.

IBBPBBP.. is commonly used by professional encoders.

MPEG encoders currently under development optimize the GOP structure 
a little further than the conventional sequences listed above, usually in that an I picture is 
chosen when a hard scene change occurs. 

Fig. 1 shows an example of the commonly used IBBPBBP.. structure of a
GOP comprising 12 pictures 1, 2, .. 12. The arrows shown in the Figure point to the relevant
reference picture(s). For example, B pictures 2 and 3 have been encoded using I picture 1
and P picture 4 as prediction, B pictures 5 and 6 have been encoded using P picture 4 and P
picture 7 as reference, etc. Similarly, P picture 4 has been encoded using I picture 1 as
prediction, P picture 7 has been encoded using P picture 4 as prediction, etc. Note that
the pictures are shown in display order. The transmission order is different because for
decoding a B picture the decoder must already have a future P or I picture at its disposal.
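
The difference between display order and transmission order can be illustrated with a small
sketch. The function name transmission_order and the string representation of the GOP are
assumptions made for this example. Each I or P picture is moved ahead of the B pictures that
precede it in display order, because those B pictures need it as a future reference. B pictures
at the very end of the GOP would in practice follow the first I picture of the next GOP; the
sketch simply appends them.

```python
from __future__ import annotations


def transmission_order(pattern: str) -> list[int]:
    """Return 1-based display numbers in the order they are transmitted."""
    order: list[int] = []
    pending_b: list[int] = []
    for i, t in enumerate(pattern, start=1):
        if t == "B":
            pending_b.append(i)      # must wait for its future reference
        else:
            order.append(i)          # the I or P picture is sent first ...
            order.extend(pending_b)  # ... followed by the B pictures needing it
            pending_b.clear()
    # Trailing B pictures wait for the next GOP's I picture (not modelled here).
    order.extend(pending_b)
    return order


print(transmission_order("IBBPBBPBBPBB"))
# [1, 4, 2, 3, 7, 5, 6, 10, 8, 9, 11, 12]
```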

As Fig. 1 shows, the commonly used GOP structure comprises a plurality
of BBP patterns. The pattern BPP is rarely used in a GOP. In a preferred embodiment of the
present invention, it is this pattern which is used to represent watermark data. Fig. 2 shows a
GOP including such a BPP pattern (the pictures 5, 6 and 7).

A watermark in the form of a BPP pattern can easily be detected because
the picture coding type is included in the respective picture header. However, it is impossible
to remove the watermark by merely changing the parameter picture_type. For example, if the
parameter picture_type of picture 6 in Fig. 2 is changed from P into B as shown in Fig. 3, a
decoder will decode this picture with reference to P picture 4 and P picture 7, whereas the
encoder used P picture 4 as the prediction image only. Picture 7 will not be decoded
correctly either, because it now refers to P picture 4 whereas the encoder referred to picture 6.
Needless to say, the decoder will fail, or at least produce erroneous results.

Similarly, if the watermark is removed by changing the parameter
picture_type of picture 7 in Fig. 2 from P into B as shown in Fig. 4, the decoder will decode
picture 7 with reference to P pictures 6 and 10, whereas the encoder made reference to
picture 6 only. The pictures 8, 9 and 10 will not be decoded correctly either, because their
original references to P picture 7 have been changed into references to P picture 6.

Neither can the parameter picture_type of P picture 6 or P picture 7 be
changed from P into I because, in that case, a predictively encoded picture is interpreted
as an autonomously encoded picture (pixel differences are interpreted as pixels).
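
The reference mismatches described for Figs. 3 and 4 can be reproduced with a short sketch.
It compares the references implied by the picture coding types used by the encoder with those
implied by the relabelled headers. The helper names and the 12-picture strings are assumptions:
the text specifies pictures 1 to 10 of Fig. 2, and pictures 11 and 12 are assumed here to be B
pictures.

```python
from __future__ import annotations


def references(pattern: str) -> dict[int, tuple[int, ...]]:
    """References implied by the coding types (display order, 1-based)."""
    anchors = [i for i, t in enumerate(pattern, 1) if t in "IP"]
    refs = {}
    for i, t in enumerate(pattern, 1):
        prev = tuple(a for a in anchors if a < i)[-1:]
        nxt = tuple(a for a in anchors if a > i)[:1]
        refs[i] = () if t == "I" else prev if t == "P" else prev + nxt
    return refs


def mismatched_pictures(encoded: str, relabelled: str) -> list[int]:
    """Pictures whose decoder-side references differ from the encoder's."""
    enc, dec = references(encoded), references(relabelled)
    return [i for i in enc if enc[i] != dec[i]]


FIG_2 = "IBBPBPPBBPBB"  # watermarked GOP, BPP at pictures 5, 6 and 7
FIG_3 = "IBBPBBPBBPBB"  # picture 6 relabelled from P to B
FIG_4 = "IBBPBPBBBPBB"  # picture 7 relabelled from P to B

print(mismatched_pictures(FIG_2, FIG_3))  # [5, 6, 7]
print(mismatched_pictures(FIG_2, FIG_4))  # [7, 8, 9, 10]
```

Under the referencing rule as stated, B picture 5 is also flagged in the first case, since its
future reference changes from picture 6 to picture 7.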

Accordingly, in order to remove a watermark, the relevant picture has to
be transcoded, i.e. decoded into the pixel domain and encoded again in accordance with its
modified picture coding type. That is not attractive for a hacker because, as mentioned
before, high-quality encoding involves complicated motion estimation circuitry, unless a
severe degradation of quality is accepted. In this respect, it is to be noted that not only the
picture whose picture_type parameter has been changed has to be reencoded. Pictures
referring to the modified picture are to be reencoded as well. For example, if picture 7 in
Fig. 2 is transcoded from a P picture into a B picture as shown in Fig. 4, the B pictures 8, 9
and the P picture 10 will also have to be transcoded because their references have changed.
Thus, the effect of transcoding ripples through the remainder of the GOP, unless the relevant
P picture is transcoded into an I picture. However, in that case the I picture must be
compressed into the same (relatively low) number of bits as originally spent on the P picture.

Two consecutive P pictures rarely occur by accident.
The number of false alarms (watermark detected in a non-watermarked signal) is thus
limited. To further reduce the false alarm possibility, a requirement can be imposed on the
maximum number of GOPs between two watermarked GOPs. For example, a video stream is
specified to be copyright protected if watermarked GOPs occur at sufficiently small intervals.
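
Such an interval requirement might be checked as sketched below. The function name
is_copy_protected, the parameter max_gap and the choice to count unmarked GOPs at the start
and end of the stream as gaps are all assumptions made for this example; the text does not
prescribe a particular threshold or rule.

```python
from __future__ import annotations


def is_copy_protected(gop_patterns: list[str], max_gap: int) -> bool:
    """True if no run of non-watermarked GOPs is longer than `max_gap`.

    A GOP counts as watermarked when its pattern of picture coding types
    contains the BPP pattern.
    """
    marked = [i for i, p in enumerate(gop_patterns) if "BPP" in p]
    if not marked:
        return False
    # Virtual boundaries before the first and after the last GOP.
    boundaries = [-1] + marked + [len(gop_patterns)]
    gaps = [b - a - 1 for a, b in zip(boundaries, boundaries[1:])]
    return max(gaps) <= max_gap


plain, marked = "IBBPBBPBBPBB", "IBBPBPPBBPBB"
stream = [plain, plain, marked, plain, marked, plain]
print(is_copy_protected(stream, max_gap=2))                  # True
print(is_copy_protected([marked] + [plain] * 10, max_gap=2))  # False
```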

The above described concept of watermarking a GOP allows messages of
any length to be embedded in an MPEG video signal. To this end, different supplemental
data values are assigned to different positions of the BPP pattern in the GOP. A first example
thereof is shown in Fig. 5. In this example, a GOP 20 which starts with the BPP pattern
represents a sync code to indicate the start of a message. A GOP 21 with the BPP pattern
after a single P represents a binary supplemental data value "0". A GOP 22 with the BPP
pattern after two Ps represents a binary "1". The reference numeral 23 denotes an MPEG
encoded video signal segment with an embedded message "0110..". Note that not each GOP
conveys a supplemental data value (most of the GOPs have the common IBBPBBP..
structure) in view of the fact that watermarking affects the encoding efficiency. It is notably
advantageous to embed a supplemental data value in every n-th GOP (n being a predetermined
integer) to assist a watermark detector in identifying the relevant GOPs and to reduce the
false alarm rate. Note also that the GOPs in video signal 23 have variable lengths. Not only
may the number of pictures in a GOP vary, the number of bits per picture also depends
largely on the image contents.
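
The message scheme of Fig. 5 can be sketched as follows. Since the Figure itself is not
reproduced here, the sketch rests on one reading of the text: the number of P pictures that
precede the BPP pattern selects the symbol (none for the sync code, one for "0", two for "1").
The function names and the example GOP strings are assumptions made for illustration.

```python
from __future__ import annotations

from typing import Optional


def decode_gop(pattern: str) -> Optional[str]:
    """Return "SYNC", "0" or "1" for a watermarked GOP, else None."""
    pos = pattern.find("BPP")
    if pos < 0:
        return None  # conventional GOP, conveys no data
    preceding_p = pattern[:pos].count("P")
    return {0: "SYNC", 1: "0", 2: "1"}.get(preceding_p)


def decode_message(gops: list[str]) -> str:
    """Collect the bits that follow the most recent sync code."""
    bits: list[str] = []
    started = False
    for gop in gops:
        symbol = decode_gop(gop)
        if symbol == "SYNC":
            bits, started = [], True
        elif started and symbol is not None:
            bits.append(symbol)
    return "".join(bits)


PLAIN = "IBBPBBPBBPBB"
SYNC = "IBPPBBPBBPBB"   # BPP directly after the I picture
ZERO = "IBBPBPPBBPBB"   # BPP after a single P
ONE = "IBBPBBPBPPBB"    # BPP after two Ps

segment = [PLAIN, SYNC, PLAIN, ZERO, ONE, PLAIN, ONE, ZERO]
print(decode_message(segment))  # 0110
```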

It will be appreciated that the alphabet of supplemental data values can be 
further enlarged. For example, six different message symbols 0-5 may be assigned to GOP 
structures in accordance with the following Table I: 



symbol    GOP structure
0         IBPPBBPBB..
1         IBBPPBPBB..
2         IBBPBPPBB..
3         IBBPBBPPB..
4         IBBPBBPBPPBB..
5         IBBPBBPBBPPB..
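
Table I lends itself to a simple look-up in either direction. The sketch below stores the
structures exactly as listed, without the trailing "..", and treats them as prefixes of the
actual GOP pattern; this prefix matching and the names TABLE_I and symbol_of are assumptions
made for the example.

```python
from __future__ import annotations

from typing import Optional

# Table I: message symbol -> leading part of the GOP structure.
TABLE_I = {
    0: "IBPPBBPBB",
    1: "IBBPPBPBB",
    2: "IBBPBPPBB",
    3: "IBBPBBPPB",
    4: "IBBPBBPBPPBB",
    5: "IBBPBBPBBPPB",
}


def symbol_of(pattern: str) -> Optional[int]:
    """Return the symbol whose GOP structure the pattern starts with."""
    for symbol, prefix in TABLE_I.items():
        if pattern.startswith(prefix):
            return symbol
    return None  # e.g. the conventional IBBPBBP.. structure


print(symbol_of("IBBPBPPBBPBB"))  # 2
print(symbol_of("IBBPBBPBBPBB"))  # None
```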



Fig. 6 shows a schematic diagram of an arrangement for embedding
supplemental data in an MPEG video signal in accordance with the invention. The
arrangement includes a conventional MPEG video encoder 30 and a control circuit 40. The
MPEG video encoder 30 is shown in more detail, however only to an extent necessary to
understand the invention. More particularly, the encoder comprises a subtractor 31 which
receives a video signal x to be encoded, and subtracts therefrom a prediction signal x̂. The
difference signal is subjected to discrete cosine transform, quantization and variable length
coding (in the Figure collectively denoted 32). The MPEG encoded output signal v is
transmitted to a receiver or stored on a storage medium (not shown). It is also locally
decoded by a local decoder 33 and, through an adder 34, applied to a motion estimation and
compensation circuit 35. Said motion estimation and compensation circuit provides a forward
predicted picture and a bidirectional predicted picture.

The three MPEG encoding modes (I, P, B) are symbolized in Fig. 6 by a
selection switch 36 which selects the prediction signal x̂ applied to the subtractor 31. The
selection switch has three input terminals denoted I, P and B. If the I-terminal is selected, the
prediction signal x̂ is zero, so that the input signal x is encoded autonomously.
If the P-terminal is selected, the forward predicted picture is applied to the
subtractor. If the B-terminal is selected, the bidirectional predicted picture is applied to the
subtractor. The selected prediction signal x̂ is also fed back into the motion estimation and
compensation circuit 35 through the adder 34.

The current encoding mode (I, P, B) is controlled by the control circuit 40
which controls the selection switch 36 through a picture coding type signal PT in accordance
with a received watermark message w to be embedded. Fig. 7 shows a flowchart illustrating
the operation of said control circuit. In this example it is assumed that a watermark symbol
w_i (w_i = 0..5) is to be embedded in every 8th GOP in accordance with Table I described
hereinabove. In a step 50, it is determined whether the current GOP is the 8th GOP. If that is
not the case, the control signal PT = IBBPBBP.. is generated in a step 51. In response
hereto, the MPEG encoder produces the conventional GOP structure. However, for every 8th
GOP, the next watermark symbol w_i of the watermark message w to be embedded is read in
a step 52. Then the GOP structure assigned to said symbol is looked up in a memory in
which Table I is stored, and the corresponding control signal PT = I..BPP.. is generated in
a step 53.
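
The flowchart steps 50 to 53 can be sketched as a generator that emits one GOP structure per
GOP. The names picture_type_signal and period, the truncated GOP strings and the decision to
stop once the message is exhausted are assumptions made for this example; the control circuit
itself is a hardware arrangement.

```python
from __future__ import annotations

from typing import Iterable, Iterator

CONVENTIONAL = "IBBPBBP"  # PT = IBBPBBP.. (remainder of the GOP not modelled)
TABLE_I = {
    0: "IBPPBBPBB",
    1: "IBBPPBPBB",
    2: "IBBPBPPBB",
    3: "IBBPBBPPB",
    4: "IBBPBBPBPPBB",
    5: "IBBPBBPBBPPB",
}


def picture_type_signal(message: Iterable[int], period: int = 8) -> Iterator[str]:
    """Yield the control signal PT for successive GOPs."""
    symbols = iter(message)
    gop_count = 0
    while True:
        gop_count += 1
        if gop_count % period != 0:   # step 50: not the 8th GOP
            yield CONVENTIONAL        # step 51: conventional structure
            continue
        try:
            symbol = next(symbols)    # step 52: read next watermark symbol
        except StopIteration:
            return                    # assumed: stop when the message ends
        yield TABLE_I[symbol]         # step 53: look up structure in Table I


signal = list(picture_type_signal([2, 5]))
print(signal[7], signal[15])  # IBBPBPPBB IBBPBBPBBPPB
```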



Fig. 8 shows a schematic diagram of an arrangement for decoding
supplemental data embedded in an MPEG encoded video signal in accordance with the
invention. The arrangement comprises a GOP detector 61, a picture header detector 62, a
window generator 63, a gate 64, a shift register 65 and a look-up table 66. The picture
header detector 62 detects the presence of a predetermined 32-bit picture_start_code (the
hexadecimal value 00000100) in the MPEG signal and applies a picture header signal PHDR
to the window generator 63. In response hereto, the window generator 63 generates a timing
window W. The window W opens the gate 64 each time an MPEG parameter
picture_coding_type is received and causes said parameter to be written into the shift register
65. The parameter indicates whether the current picture is intraframe coded (I), predictively
coded (P) or bidirectionally predictively coded (B). The GOP detector 61 detects the
presence of a further predetermined 32-bit group_start_code (the hexadecimal value
000001B8) in the MPEG signal which indicates the start of a group of pictures. In response
hereto, the detector activates the look-up table 66 to convert the current pattern of picture
coding types PTRN in shift register 65 into a supplemental data value w_i. In addition, the
GOP detector resets the shift register 65 so as to start collecting the pattern of picture coding
types for the next GOP.



In summary, a method of embedding a watermark in an MPEG encoded video
signal is disclosed. An MPEG encoded video signal includes groups of pictures (GOPs), each
GOP comprising an intraframe coded (I) picture and a series of predictively encoded (P)
pictures and bidirectionally predictively encoded (B) pictures. Usually, the GOP structure
IBBPBBP.. is used. In accordance with the invention, the video signal is watermarked by
forcing the MPEG encoder to produce a GOP structure which does not normally occur, e.g. a
GOP including a BPP sequence. Different symbol values can be assigned to different positions
of the BPP sequence in the GOP.

Claims



1. A method of embedding supplemental data in a video signal comprising the
step of encoding the video signal in groups of pictures comprising an intraframe (I) coded
picture and a series of predictively (P) and bidirectionally predictively (B) coded pictures,
characterized by encoding the video signal in such a manner that the pattern of picture
coding types (I, P, B) in a group of pictures represents a supplemental data value.

2. The method as claimed in claim 1, wherein the supplemental data value is 
represented by the position of a BPP pattern in a group of pictures. 

3. An arrangement for embedding supplemental data in a video signal comprising
means (30) for encoding the video signal in groups of pictures comprising an intraframe (I)
coded picture and a series of predictively (P) and bidirectionally predictively (B) coded
pictures, characterized by means (40) for encoding the video signal in such a manner that the
pattern of picture coding types (I, P, B) in a group of pictures represents a supplemental data
value.

4. The arrangement as claimed in claim 3, wherein the supplemental data value is
represented by the position of a BPP pattern in a group of pictures.

5. A method of decoding supplemental data embedded in a video signal encoded
in groups of pictures comprising an intraframe (I) coded picture and a series of predictively
(P) and bidirectionally predictively (B) coded pictures, characterized by the steps of reading
the picture coding types (I, P, B) in a group of pictures and determining a supplemental data
value represented by the pattern of picture coding types in said group of pictures.

6. The method as claimed in claim 5, wherein the supplemental data value is 
represented by the position of a BPP pattern in the group of pictures. 

7. An arrangement for decoding supplemental data embedded in a video signal
encoded in groups of pictures comprising an intraframe (I) coded picture and a series of
predictively (P) and bidirectionally predictively (B) coded pictures, characterized by means
for reading the picture coding types (I, P, B) in a group of pictures and determining a
supplemental data value represented by the pattern of picture coding types in said group of
pictures.

8. The arrangement as claimed in claim 7, wherein the supplemental data value is
represented by the position of a BPP pattern in the group of pictures.

WO 98/31152   PCT/TB98/00026

9. An encoded video signal with embedded supplemental data, comprising groups
of intraframe (I), predictively (P) and bidirectionally predictively (B) coded pictures,
characterized in that supplemental data values are represented by respective predetermined
patterns of picture coding types (I, B and P) in a group of pictures.

10. The signal as claimed in claim 9, wherein the supplemental data value is 
represented by the position of a BPP pattern in a group of pictures. 

11. A storage medium on which an encoded video signal with embedded
supplemental data as claimed in claim 9 or 10 is stored.